Search Results: "justin"

12 March 2022

Dirk Eddelbuettel: Rcpp 1.0.8.2: Hotfix release per CRAN request

rcpp logo A new hot-fix release 1.0.8.2 of Rcpp just got to CRAN. It will also be uploaded to Debian shortly, and Windows and macOS binaries will appear at CRAN in the next few days. This release breaks with the six-months cycle started with release 1.0.5 in July 2020 as CRAN desired an update to silence nags from the newest clang version which turned a little loud over a feature deprecated in C++11 (namely std::unary_function() and std::binary_function()). This was easy to replace with std::function() which we did. The release also contains a minor bugfix relative to 1.0.8 and C++98 builds, and minor correction to one pdf vignette. The release was fully tested by us and CRAN as usual against all reverse dependencies. Rcpp has become the most popular way of enhancing R with C or C++ code. Right now, around 2519 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 239 in BioConductor. The full list of details for this interim release follows.

Changes in Rcpp hotfix release version 1.0.8.2 (2022-03-10)
  • Changes in Rcpp API:
    • Accomodate C++98 compilation by adjusting attributes.cpp (Dirk in #1193 fixing #1192)
    • Accomodate newest compilers replacing deprecated std::unary_function and std::binary_function with std::function (Dirk in #1202 fixing #1201 and CRAN request)
  • Changes in Rcpp Documentation:
    • Adjust one overflowing column (Bill Denney in #1196 fixing #1195)
  • Changes in Rcpp Deployment:
    • Accomodate four digit version numbers in unit test (Dirk)

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bugs reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues); questions are also welcome under rcpp tag at StackOverflow which also allows searching among the (currently) 2843 previous questions. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

30 January 2022

Russell Coker: Links Jan 2022

Washington Post has an interesting article on how gender neutral language is developing in different countries [1]. pimaker has an interesting blog post about how they wrote a RISCV CPU emulator to boot a Linux kernel in a pixel shader in the VR Chat platform [2]. ZD has an interesting article about the new Solo Bumblebee platform for writing EBPF programs to run inside the Linux kernel [3]. EBPF is an interesting platform and it s good to have new tools to help people develop for it. Big Think has an interesting article about augmented reality suggesting that it could be worse than social media for driving disputes [4]. Some people would want tags of racial status on all people they see. Vice has an insightful article making the case for 8 hours of work per week [5]. The Guardian has an insightful article about how our attention is being stolen by modern technology [6]. Interesting article about the Ilobleed rootkit that targets the HP ILO server management system [7], it s apparently designed for persistent attacks as it bypasses the firmware upgrade process and updates only the version number so anyone who thinks that a firmware update will fix it is horribly mislead. Nick Bostrom wrote an insightful and disturbing article for Aeon titles None of Our Technologies has Managed to Destroy Humanity Yet [8] about the existential risk of new technologies and how they might be mitigated, much of which involves authoritarian governments unlike any we have seen before. As a counterpoint the novel A Deepness in the Sky claims that universal surveillance would be as damaging to the future of a society as planet-buster bombs. Euronews has an informative article about eco-fascism [9]. The idea that the solution to ecological problems is to have less people in the world and particularly less non-white people is a gateway from green politics to fascism. The developer of the colors and faker npm libraries (for NodeJS) recently uploaded corrupted versions of those libraries as a protest against companies that use them without paying him [10]. This is the wrong way to approach the issue, but it does demonstrate a serious problem with systems like NPM that allow automatic updates. It gives a better result to just use software packaged by a distribution which has QA checks applied including on security updates. VentureBeat has an interesting article on the way that AI is taking over jobs [11]. Lots of white collar jobs can be replaced by machine learning systems, we need to plan for the ways this will change the economy. The Atlantic has an interesting article about tapeworms that infect some ants [12]. The tapeworms make the ants live longer and produce pheromones to make the other ants serve them. This increases the chance that the ants will be eaten by birds which is the next stage in the tapeworm lifecycle.

23 January 2022

Antoine Beaupr : Switching from OpenNTPd to Chrony

A friend recently reminded me of the existence of chrony, a "versatile implementation of the Network Time Protocol (NTP)". The excellent introduction is worth quoting in full:
It can synchronise the system clock with NTP servers, reference clocks (e.g. GPS receiver), and manual input using wristwatch and keyboard. It can also operate as an NTPv4 (RFC 5905) server and peer to provide a time service to other computers in the network. It is designed to perform well in a wide range of conditions, including intermittent network connections, heavily congested networks, changing temperatures (ordinary computer clocks are sensitive to temperature), and systems that do not run continuosly, or run on a virtual machine. Typical accuracy between two machines synchronised over the Internet is within a few milliseconds; on a LAN, accuracy is typically in tens of microseconds. With hardware timestamping, or a hardware reference clock, sub-microsecond accuracy may be possible.
Now that's already great documentation right there. What it is, why it's good, and what to expect from it. I want more. They have a very handy comparison table between chrony, ntp and openntpd.

My problem with OpenNTPd Following concerns surrounding the security (and complexity) of the venerable ntp program, I have, a long time ago, switched to using openntpd on all my computers. I hadn't thought about it until I recently noticed a lot of noise on one of my servers:
jan 18 10:09:49 curie ntpd[1069]: adjusting local clock by -1.604366s
jan 18 10:08:18 curie ntpd[1069]: adjusting local clock by -1.577608s
jan 18 10:05:02 curie ntpd[1069]: adjusting local clock by -1.574683s
jan 18 10:04:00 curie ntpd[1069]: adjusting local clock by -1.573240s
jan 18 10:02:26 curie ntpd[1069]: adjusting local clock by -1.569592s
You read that right, openntpd was constantly rewinding the clock, sometimes in less than two minutes. The above log was taken while doing diagnostics, looking at the last 30 minutes of logs. So, on average, one 1.5 seconds rewind per 6 minutes! That might be due to a dying real time clock (RTC) or some other hardware problem. I know for a fact that the CMOS battery on that computer (curie) died and I wasn't able to replace it (!). So that's partly garbage-in, garbage-out here. But still, I was curious to see how chrony would behave... (Spoiler: much better.) But I also had trouble on another workstation, that one a much more recent machine (angela). First, it seems OpenNTPd would just fail at boot time:
anarcat@angela:~(main)$ sudo systemctl status openntpd
  openntpd.service - OpenNTPd Network Time Protocol
     Loaded: loaded (/lib/systemd/system/openntpd.service; enabled; vendor pres>
     Active: inactive (dead) since Sun 2022-01-23 09:54:03 EST; 6h ago
       Docs: man:openntpd(8)
    Process: 3291 ExecStartPre=/usr/sbin/ntpd -n $DAEMON_OPTS (code=exited, sta>
    Process: 3294 ExecStart=/usr/sbin/ntpd $DAEMON_OPTS (code=exited, status=0/>
   Main PID: 3298 (code=exited, status=0/SUCCESS)
        CPU: 34ms
jan 23 09:54:03 angela systemd[1]: Starting OpenNTPd Network Time Protocol...
jan 23 09:54:03 angela ntpd[3291]: configuration OK
jan 23 09:54:03 angela ntpd[3297]: ntp engine ready
jan 23 09:54:03 angela ntpd[3297]: ntp: recvfrom: Permission denied
jan 23 09:54:03 angela ntpd[3294]: Terminating
jan 23 09:54:03 angela systemd[1]: Started OpenNTPd Network Time Protocol.
jan 23 09:54:03 angela systemd[1]: openntpd.service: Succeeded.
After a restart, somehow it worked, but it took a long time to sync the clock. At first, it would just not consider any peer at all:
anarcat@angela:~(main)$ sudo ntpctl -s all
0/20 peers valid, clock unsynced
peer
   wt tl st  next  poll          offset       delay      jitter
159.203.8.72 from pool 0.debian.pool.ntp.org
    1  5  2    6s    6s             ---- peer not valid ----
138.197.135.239 from pool 0.debian.pool.ntp.org
    1  5  2    6s    7s             ---- peer not valid ----
216.197.156.83 from pool 0.debian.pool.ntp.org
    1  4  1    2s    9s             ---- peer not valid ----
142.114.187.107 from pool 0.debian.pool.ntp.org
    1  5  2    5s    6s             ---- peer not valid ----
216.6.2.70 from pool 1.debian.pool.ntp.org
    1  4  2    2s    8s             ---- peer not valid ----
207.34.49.172 from pool 1.debian.pool.ntp.org
    1  4  2    0s    5s             ---- peer not valid ----
198.27.76.102 from pool 1.debian.pool.ntp.org
    1  5  2    5s    5s             ---- peer not valid ----
158.69.254.196 from pool 1.debian.pool.ntp.org
    1  4  3    1s    6s             ---- peer not valid ----
149.56.121.16 from pool 2.debian.pool.ntp.org
    1  4  2    5s    9s             ---- peer not valid ----
162.159.200.123 from pool 2.debian.pool.ntp.org
    1  4  3    1s    6s             ---- peer not valid ----
206.108.0.131 from pool 2.debian.pool.ntp.org
    1  4  1    6s    9s             ---- peer not valid ----
205.206.70.40 from pool 2.debian.pool.ntp.org
    1  5  2    8s    9s             ---- peer not valid ----
2001:678:8::123 from pool 2.debian.pool.ntp.org
    1  4  2    5s    9s             ---- peer not valid ----
2606:4700:f1::1 from pool 2.debian.pool.ntp.org
    1  4  3    2s    6s             ---- peer not valid ----
2607:5300:205:200::1991 from pool 2.debian.pool.ntp.org
    1  4  2    5s    9s             ---- peer not valid ----
2607:5300:201:3100::345c from pool 2.debian.pool.ntp.org
    1  4  4    1s    6s             ---- peer not valid ----
209.115.181.110 from pool 3.debian.pool.ntp.org
    1  5  2    5s    6s             ---- peer not valid ----
205.206.70.42 from pool 3.debian.pool.ntp.org
    1  4  2    0s    6s             ---- peer not valid ----
68.69.221.61 from pool 3.debian.pool.ntp.org
    1  4  1    2s    9s             ---- peer not valid ----
162.159.200.1 from pool 3.debian.pool.ntp.org
    1  4  3    4s    7s             ---- peer not valid ----
Then it would accept them, but still wouldn't sync the clock:
anarcat@angela:~(main)$ sudo ntpctl -s all
20/20 peers valid, clock unsynced
peer
   wt tl st  next  poll          offset       delay      jitter
159.203.8.72 from pool 0.debian.pool.ntp.org
    1  8  2    5s    6s         0.672ms    13.507ms     0.442ms
138.197.135.239 from pool 0.debian.pool.ntp.org
    1  7  2    4s    8s         1.260ms    13.388ms     0.494ms
216.197.156.83 from pool 0.debian.pool.ntp.org
    1  7  1    3s    5s        -0.390ms    47.641ms     1.537ms
142.114.187.107 from pool 0.debian.pool.ntp.org
    1  7  2    1s    6s        -0.573ms    15.012ms     1.845ms
216.6.2.70 from pool 1.debian.pool.ntp.org
    1  7  2    3s    8s        -0.178ms    21.691ms     1.807ms
207.34.49.172 from pool 1.debian.pool.ntp.org
    1  7  2    4s    8s        -5.742ms    70.040ms     1.656ms
198.27.76.102 from pool 1.debian.pool.ntp.org
    1  7  2    0s    7s         0.170ms    21.035ms     1.914ms
158.69.254.196 from pool 1.debian.pool.ntp.org
    1  7  3    5s    8s        -2.626ms    20.862ms     2.032ms
149.56.121.16 from pool 2.debian.pool.ntp.org
    1  7  2    6s    8s         0.123ms    20.758ms     2.248ms
162.159.200.123 from pool 2.debian.pool.ntp.org
    1  8  3    4s    5s         2.043ms    14.138ms     1.675ms
206.108.0.131 from pool 2.debian.pool.ntp.org
    1  6  1    0s    7s        -0.027ms    14.189ms     2.206ms
205.206.70.40 from pool 2.debian.pool.ntp.org
    1  7  2    1s    5s        -1.777ms    53.459ms     1.865ms
2001:678:8::123 from pool 2.debian.pool.ntp.org
    1  6  2    1s    8s         0.195ms    14.572ms     2.624ms
2606:4700:f1::1 from pool 2.debian.pool.ntp.org
    1  7  3    6s    9s         2.068ms    14.102ms     1.767ms
2607:5300:205:200::1991 from pool 2.debian.pool.ntp.org
    1  6  2    4s    9s         0.254ms    21.471ms     2.120ms
2607:5300:201:3100::345c from pool 2.debian.pool.ntp.org
    1  7  4    5s    9s        -1.706ms    21.030ms     1.849ms
209.115.181.110 from pool 3.debian.pool.ntp.org
    1  7  2    0s    7s         8.907ms    75.070ms     2.095ms
205.206.70.42 from pool 3.debian.pool.ntp.org
    1  7  2    6s    9s        -1.729ms    53.823ms     2.193ms
68.69.221.61 from pool 3.debian.pool.ntp.org
    1  7  1    1s    7s        -1.265ms    46.355ms     4.171ms
162.159.200.1 from pool 3.debian.pool.ntp.org
    1  7  3    4s    8s         1.732ms    35.792ms     2.228ms
It took a solid five minutes to sync the clock, even though the peers were considered valid within a few seconds:
jan 23 15:58:41 angela systemd[1]: Started OpenNTPd Network Time Protocol.
jan 23 15:58:58 angela ntpd[84086]: peer 142.114.187.107 now valid
jan 23 15:58:58 angela ntpd[84086]: peer 198.27.76.102 now valid
jan 23 15:58:58 angela ntpd[84086]: peer 207.34.49.172 now valid
jan 23 15:58:58 angela ntpd[84086]: peer 209.115.181.110 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 159.203.8.72 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 138.197.135.239 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 162.159.200.123 now valid
jan 23 15:58:59 angela ntpd[84086]: peer 2607:5300:201:3100::345c now valid
jan 23 15:59:00 angela ntpd[84086]: peer 2606:4700:f1::1 now valid
jan 23 15:59:00 angela ntpd[84086]: peer 158.69.254.196 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 216.6.2.70 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 68.69.221.61 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 205.206.70.40 now valid
jan 23 15:59:01 angela ntpd[84086]: peer 205.206.70.42 now valid
jan 23 15:59:02 angela ntpd[84086]: peer 162.159.200.1 now valid
jan 23 15:59:04 angela ntpd[84086]: peer 216.197.156.83 now valid
jan 23 15:59:05 angela ntpd[84086]: peer 206.108.0.131 now valid
jan 23 15:59:05 angela ntpd[84086]: peer 2001:678:8::123 now valid
jan 23 15:59:05 angela ntpd[84086]: peer 149.56.121.16 now valid
jan 23 15:59:07 angela ntpd[84086]: peer 2607:5300:205:200::1991 now valid
jan 23 16:03:47 angela ntpd[84086]: clock is now synced
That seems kind of odd. It was also frustrating to have very little information from ntpctl about the state of the daemon. I understand it's designed to be minimal, but it could inform me on his known offset, for example. It does tell me about the offset with the different peers, but not as clearly as one would expect. It's also unclear how it disciplines the RTC at all.

Compared to chrony Now compare with chrony:
jan 23 16:07:16 angela systemd[1]: Starting chrony, an NTP client/server...
jan 23 16:07:16 angela chronyd[87765]: chronyd version 4.0 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP +SCFILTER +SIGND +ASYNCDNS +NTS +SECHASH +IPV6 -DEBUG)
jan 23 16:07:16 angela chronyd[87765]: Initial frequency 3.814 ppm
jan 23 16:07:16 angela chronyd[87765]: Using right/UTC timezone to obtain leap second data
jan 23 16:07:16 angela chronyd[87765]: Loaded seccomp filter
jan 23 16:07:16 angela systemd[1]: Started chrony, an NTP client/server.
jan 23 16:07:21 angela chronyd[87765]: Selected source 206.108.0.131 (2.debian.pool.ntp.org)
jan 23 16:07:21 angela chronyd[87765]: System clock TAI offset set to 37 seconds
First, you'll notice there's none of that "clock synced" nonsense, it picks a source, and then... it's just done. Because the clock on this computer is not drifting that much, and openntpd had (presumably) just sync'd it anyways. And indeed, if we look at detailed stats from the powerful chronyc client:
anarcat@angela:~(main)$ sudo chronyc tracking
Reference ID    : CE6C0083 (ntp1.torix.ca)
Stratum         : 2
Ref time (UTC)  : Sun Jan 23 21:07:21 2022
System time     : 0.000000311 seconds slow of NTP time
Last offset     : +0.000807989 seconds
RMS offset      : 0.000807989 seconds
Frequency       : 3.814 ppm fast
Residual freq   : -24.434 ppm
Skew            : 1000000.000 ppm
Root delay      : 0.013200894 seconds
Root dispersion : 65.357254028 seconds
Update interval : 1.4 seconds
Leap status     : Normal
We see that we are nanoseconds away from NTP time. That was ran very quickly after starting the server (literally in the same second as chrony picked a source), so stats are a bit weird (e.g. the Skew is huge). After a minute or two, it looks more reasonable:
Reference ID    : CE6C0083 (ntp1.torix.ca)
Stratum         : 2
Ref time (UTC)  : Sun Jan 23 21:09:32 2022
System time     : 0.000487002 seconds slow of NTP time
Last offset     : -0.000332960 seconds
RMS offset      : 0.000751204 seconds
Frequency       : 3.536 ppm fast
Residual freq   : +0.016 ppm
Skew            : 3.707 ppm
Root delay      : 0.013363549 seconds
Root dispersion : 0.000324015 seconds
Update interval : 65.0 seconds
Leap status     : Normal
Now it's learning how good or bad the RTC clock is ("Frequency"), and is smoothly adjusting the System time to follow the average offset (RMS offset, more or less). You'll also notice the Update interval has risen, and will keep expanding as chrony learns more about the internal clock, so it doesn't need to constantly poll the NTP servers to sync the clock. In the above, we're 487 micro seconds (less than a milisecond!) away from NTP time. (People interested in the explanation of every single one of those fields can read the excellent chronyc manpage. That thing made me want to nerd out on NTP again!) On the machine with the bad clock, chrony also did a 1.5 second adjustment, but just once, at startup:
jan 18 11:54:33 curie chronyd[2148399]: Selected source 206.108.0.133 (2.debian.pool.ntp.org) 
jan 18 11:54:33 curie chronyd[2148399]: System clock wrong by -1.606546 seconds 
jan 18 11:54:31 curie chronyd[2148399]: System clock was stepped by -1.606546 seconds 
jan 18 11:54:31 curie chronyd[2148399]: System clock TAI offset set to 37 seconds 
Then it would still struggle to keep the clock in sync, but not as badly as openntpd. Here's the offset a few minutes after that above startup:
System time     : 0.000375352 seconds slow of NTP time
And again a few seconds later:
System time     : 0.001793046 seconds slow of NTP time
I don't currently have access to that machine, and will update this post with the latest status, but so far I've had a very good experience with chrony on that machine, which is a testament to its resilience, and it also just works on my other machines as well.

Extras On top of "just working" (as demonstrated above), I feel that chrony's feature set is so much superior... Here's an excerpt of the extras in chrony, taken from the comparison table:
  • source frequency tracking
  • source state restore from file
  • temperature compensation
  • ready for next NTP era (year 2036)
  • replace unreachable / falseticker servers
  • aware of jitter
  • RTC drift tracking
  • RTC trimming
  • Restore time from file w/o RTC
  • leap seconds correction, in slew mode
  • drops root privileges
I even understand some of that stuff. I think. So kudos to the chrony folks, I'm switching.

Caveats One thing to keep in mind in the above, however is that it's quite possible chrony does as bad of a job as openntpd on that old machine, and just doesn't tell me about it. For example, here's another log sample from another server (marcos):
jan 23 11:13:25 marcos ntpd[1976694]: adjusting clock frequency by 0.451035 to -16.420273ppm
I get those basically every day, which seems to show that it's at least trying to keep track of the hardware clock. In other words, it's quite possible I have no idea what I'm talking about and you definitely need to take this article with a grain of salt. I'm not an NTP expert. Update: I should also mentioned that I haven't evaluated systemd-timesyncd, for a few reasons:
  1. I have enough things running under systemd
  2. I wasn't aware of it when I started writing this
  3. I couldn't find good documentation on it... later I found the above manpage and of course the Arch Wiki but that is very minimal
  4. therefore I can't tell how it compares with chrony or (open)ntpd, so I don't see an enticing reason to switch
It has a few things going for it though:
  • it's likely shipped with your distribution already
  • it drops privileges (possibly like chrony, unclear if it also has seccomp filters)
  • it's minimalist: it only does SNTP so not the server side
  • the status command is good enough that you can tell the clock frequency, precision, and so on (especially when compared to openntpd's ntpctl)
So I'm reserving judgement over it, but I'd certainly note that I'm always a little weary in trusting systemd daemons with the network, and would prefer to keep that attack surface to a minimum. Diversity is a good thing, in general, so I'll keep chrony for now. It would certainly nice to see it added to chrony's comparison table.

Switching to chrony Because the default configuration in chrony (at least as shipped in Debian) is sane (good default peers, no open network by default), installing it is as simple as:
apt install chrony
And because it somehow conflicts with openntpd, that also takes care of removing that cruft as well.

Update: Debian defaults So it seems like I managed to write this entire blog post without putting it in relation with the original reason I had to think about this in the first place, which is odd and should be corrected. This conversation came about on an IRC channel that mentioned that the ntp package (and upstream) is in bad shape in Debian. In that discussion, chrony and ntpsec were discussed as possible replacements, but when we had the discussion on chat, I mentioned I was using openntpd, and promptly realized I was actually unhappy with it. A friend suggested chrony, I tried it, and it worked amazingly, I switched, wrote this blog post, end of story. Except today (2022-02-07, two weeks later), I actually read that thread and realized that something happened in Debian I wasn't actually aware of. In bookworm, systemd-timesyncd was not only shipped, but it was installed by default, as it was marked as a hard dependency of systemd. That was "fixed" in systemd-247.9-2 (see bug 986651), but only by making the dependency a Recommends and marking it as Priority: important. So in effect, systemd-timesyncd became the default NTP daemon in Debian in bookworm, which I find somewhat surprising. timesyncd has many things going for it (as mentioned above), but I do find it a bit annoying that systemd is replacing all those utilities in such a way. I also wonder what is going to happen on upgrades. This is all a little frustrating too because there is no good comparison between the other NTP daemons and timesyncd anywhere. The chrony comparison table doesn't mention it, and an audit by the Core Infrastructure Initiative from 2017 doesn't mention it either, even though timesyncd was announced in 2014. (Same with this blog post from Facebook.)

31 December 2021

Russell Coker: Links December 2021

Wired magazine has many short documentary films on YouTube, this one about How Photography is Affecting Our Brains is particularly good [1]. Matt Blaze wrote an informative blog post about Faraday cages for phones [2]. It seems that the commercial shielded bags are all pretty good while doing it yourself with aluminium foil may get similar results or may get much worse results with no obvious difference in the quality of the wrapping. Aluminium foil doesn t protect that well and doesn t protect consistently. A metal biscuit tin performed quite well and consistently, so that s a cheap option for reducing signals. Umair Haque wrote an insightful article about the single word that describes most of the problems the world faces right now [3]. Forbes has an informative article about the early days of the Ford company when they doubled wages, it proves that they didn t do so to enable workders to afford cars but to avoid staff turnover (which is expensive) [4]. Also the Ford company had a fascistic approach to employees, controlling what they were allowed to do in their spare time if they wanted the bonus payment. The wages weren t doubled, there was a bonus payment that would double the salary if the employee was eligible for the bonus. One thing that Forbes gets wrong is that they claim that it was only having higher pay than other companies that provided a benefit and that a higher minimum wage wouldn t, the problem with that idea is that a higher minimum wage would discourage people from having multiple jobs and allow more families to not have the mother working (a condition for a man to get the Ford bonus was for his wife to not work). The WSJ has an interesting article about Intel s datacenter for running all the different configurations of CPUs that they have supported over the last 10 years for security tests [5]. My Thinkpad (which is less than 10yo) is vulnerable to one of the SPECTRE family of exploits as Intel hasn t released microcode to fix it, getting fixed microcode out for all the systems from major vendors like Lenovo would be a good idea if they want to improve their security. NPR has an interesting article about the correlation between support for Trump in counties of the US with lack of vaccination and Covid19 deaths [6]. No surprises, but it s good to see the graphs. Cory Doctorow wrote an interesting article on the lack of slack in the current American education system [7]. It s not that bad in Australia but we are unfortunately moving in the American direction. Teen Vogue has an insightful article about the problems with the focus on resilience [8], while resilience is good we should make it a higher priority to avoid putting people in situations where they need to be resiliant than on encouraging resilience.

6 December 2021

Matthias Klumpp: New things in AppStream 0.15

On the road to AppStream 1.0, a lot of items from the long todo list have been done so far only one major feature is remaining, external release descriptions, which is a tricky one to implement and specify. For AppStream 1.0 it needs to be present or be rejected though, as it would be a major change in how release data is handled in AppStream. Besides 1.0 preparation work, the recent 0.15 release and the releases before it come with their very own large set of changes, that are worth a look and may be interesting for your application to support. But first, for a change that affects the implementation and not the XML format: 1. Completely rewritten caching code Keeping all AppStream data in memory is expensive, especially if the data is huge (as on Debian and Ubuntu with their large repositories generated from desktop-entry files as well) and if processes using AppStream are long-running. The latter is more and more the case, not only does GNOME Software run in the background, KDE uses AppStream in KRunner and Phosh will use it too for reading form factor information. Therefore, AppStream via libappstream provides an on-disk cache that is memory-mapped, so data is only consuming RAM if we are actually doing anything with it. Previously, AppStream used an LMDB-based cache in the background, with indices for fulltext search and other common search operations. This was a very fast solution, but also came with limitations, LMDB s maximum key size of 511 bytes became a problem quite often, adjusting the maximum database size (since it has to be set at opening time) was annoyingly tricky, and building dedicated indices for each search operation was very inflexible. In addition to that, the caching code was changed multiple times in the past to allow system-wide metadata to be cached per-user, as some distributions didn t (want to) build a system-wide cache and therefore ran into performance issues when XML was parsed repeatedly for generation of a temporary cache. In addition to all that, the cache was designed around the concept of one cache for data from all sources , which meant that we had to rebuild it entirely if just a small aspect changed, like a MetaInfo file being added to /usr/share/metainfo, which was very inefficient. To shorten a long story, the old caching code was rewritten with the new concepts of caches not necessarily being system-wide and caches existing for more fine-grained groups of files in mind. The new caching code uses Richard Hughes excellent libxmlb internally for memory-mapped data storage. Unlike LMDB, libxmlb knows about the XML document model, so queries can be much more powerful and we do not need to build indices manually. The library is also already used by GNOME Software and fwupd for parsing of (refined) AppStream metadata, so it works quite well for that usecase. As a result, search queries via libappstream are now a bit slower (very much depends on the query, roughly 20% on average), but can be mmuch more powerful. The caching code is a lot more robust, which should speed up startup time of applications. And in addition to all of that, the AsPool class has gained a flag to allow it to monitor AppStream source data for changes and refresh the cache fully automatically and transparently in the background. All software written against the previous version of the libappstream library should continue to work with the new caching code, but to make use of some of the new features, software using it may need adjustments. A lot of methods have been deprecated too now. 2. Experimental compose support Compiling MetaInfo and other metadata into AppStream collection metadata, extracting icons, language information, refining data and caching media is an involved process. The appstream-generator tool does this very well for data from Linux distribution sources, but the tool is also pretty heavyweight with lots of knobs to adjust, an underlying database and a complex algorithm for icon extraction. Embedding it into other tools via anything else but its command-line API is also not easy (due to D s GC initialization, and because it was never written with that feature in mind). Sometimes a simpler tool is all you need, so the libappstream-compose library as well as appstreamcli compose are being developed at the moment. The library contains building blocks for developing a tool like appstream-generator while the cli tool allows to simply extract metadata from any directory tree, which can be used by e.g. Flatpak. For this to work well, a lot of appstream-generator s D code is translated into plain C, so the implementation stays identical but the language changes. Ultimately, the generator tool will use libappstream-compose for any general data refinement, and only implement things necessary to extract data from the archive of distributions. New applications (e.g. for new bundling systems and other purposes) can then use the same building blocks to implement new data generators similar to appstream-generator with ease, sharing much of the code that would be identical between implementations anyway. 2. Supporting user input controls Want to advertise that your application supports touch input? Keyboard input? Has support for graphics tablets? Gamepads? Sure, nothing is easier than that with the new control relation item and supports relation kind (since 0.12.11 / 0.15.0, details):
<supports>
  <control>pointing</control>
  <control>keyboard</control>
  <control>touch</control>
  <control>tablet</control>
</supports>
3. Defining minimum display size requirements Some applications are unusable below a certain window size, so you do not want to display them in a software center that is running on a device with a small screen, like a phone. In order to encode this information in a flexible way, AppStream now contains a display_length relation item to require or recommend a minimum (or maximum) display size that the described GUI application can work with. For example:
<requires>
  <display_length compare="ge">360</display_length>
</requires>
This will make the application require a display length greater or equal to 300 logical pixels. A logical pixel (also device independent pixel) is the amount of pixels that the application can draw in one direction. Since screens, especially phone screens but also screens on a desktop, can be rotated, the display_length value will be checked against the longest edge of a display by default (by explicitly specifying the shorter edge, this can be changed). This feature is available since 0.13.0, details. See also Tobias Bernard s blog entry on this topic. 4. Tags This is a feature that was originally requested for the LVFS/fwupd, but one of the great things about AppStream is that we can take very project-specific ideas and generalize them so something comes out of them that is useful for many. The new tags tag allows people to tag components with an arbitrary namespaced string. This can be useful for project-internal organization of applications, as well as to convey certain additional properties to a software center, e.g. an application could mark itself as featured in a specific software center only. Metadata generators may also add their own tags to components to improve organization. AppStream gives no recommendations as to how these tags are to be interpreted except for them being a strictly optional feature. So any meaning is something clients and metadata authors need to negotiate. It therefore is a more specialized usecase of the already existing custom tag, and I expect it to be primarily useful within larger organizations that produce a lot of software components that need sorting. For example:
<tags>
  <tag namespace="lvfs">vendor-2021q1</tag>
  <tag namespace="plasma">featured</tag>
</tags>
This feature is available since 0.15.0, details. 5. MetaInfo Creator changes The MetaInfo Creator (source) tool is a very simple web application that provides you with a form to fill out and will then generate MetaInfo XML to add to your project after you have answered all of its questions. It is an easy way for developers to add the required metadata without having to read the specification or any guides at all. Recently, I added support for the new control and display_length tags, resolved a few minor issues and also added a button to instantly copy the generated output to clipboard so people can paste it into their project. If you want to create a new MetaInfo file, this tool is the best way to do it! The creator tool will also not transfer any data out of your webbrowser, it is strictly a client-side application. And that is about it for the most notable changes in AppStream land! Of course there is a lot more, additional tags for the LVFS and content rating have been added, lots of bugs have been squashed, the documentation has been refined a lot and the library has gained a lot of new API to make building software centers easier. Still, there is a lot to do and quite a few open feature requests too. Onwards to 1.0!

30 November 2021

Russell Coker: Links November 2021

The Guardian has an amusing article by Sophie Elmhirst about Libertarians buying a cruise ship to make a seasteading project off the coast of Panama [1]. It turns out that you need permits etc to do this and maintaining a ship is expensive. Also you wouldn t want to mine cryptocurrency in a ship cabin as most cabins are small and don t have enough airconditioning to remain pleasant if you dump 1kW or more into the air. NPR has an interesting article about the reaction of the NRA to the Columbine shootings [2]. Seems that some NRA person isn t a total asshole and is sharing their private information, maybe they are dying and are worried about going to hell. David Brin wrote an insightful blog post about the singleton hypothesis where he covers some of the evidence of autocratic societies failing [3]. I think he makes a convincing point about a single centralised government for human society not being viable. But something like the EU on a world wide scale could work well. Ken Shirriff wrote an interesting blog post about reverse engineering the Yamaha DX7 synthesiser [4]. The New York Times has an interesting article about a Baboon troop that became less aggressive after the alpha males all died at once from tuberculosis [5]. They established a new more peaceful culture that has outlived the beta males who avoided tuberculosis. The Guardian has an interesting article about how sequencing the genomes of the entire population can save healthcare costs while improving the health of the population [6]. This is somthing wealthy countries should offer for free to the world population. At a bit under $1000 per test that s only about $7 trillion to test everyone, and of course the price should drop significantly if there were billions of tests being done. The Strategy Bridge has an interesting article about SciFi books that have useful portrayals of military strategy [7]. The co-author is Major General Mick Ryan of the Australian Army which is noteworthy as Major General is the second highest rank in use by the Australian Army at this time. Vice has an interesting article about the co-evolution of penises and vaginas and how a lot of that evolution is based on avoiding impregnation from rape [8]. Cory Doctorow wrote an insightful Medium article about the way that governments could force interoperability through purchasing power [9]. Cory Doctorow wrote an insightful article for Locus Magazine about imagining life after capitalism and how capitalism might be replaced [10]. We need a Star Trek future! Arstechnica has an informative article about new developmenet in the rowhammer category of security attacks on DRAM [11]. It seems that DDR4 with ECC is the best current mitigation technique and that DDR3 with ECC is harder to attack than non-ECC RAM. So the thing to do is use ECC on all workstations and avoid doing security critical things on laptops because they can t juse ECC RAM.

29 November 2021

Russ Allbery: Fall haul

It's been a while since I've posted one of these, and I also may have had a few moments of deciding to support authors by buying their books even if I'm not going to get a chance to read them soon. There's also a bit of work reading in here. Ryka Aoki Light from Uncommon Stars (sff)
Frederick R. Chromey To Measure the Sky (non-fiction)
Neil Gaiman, et al. Sandman: Overture (graphic novel)
Alix E. Harrow A Spindle Splintered (sff)
Jordan Ifueko Raybearer (sff)
Jordan Ifueko Redemptor (sff)
T. Kingfisher Paladin's Hope (sff)
TJ Klune Under the Whispering Door (sff)
Kiese Laymon How to Slowly Kill Yourself and Others in America (non-fiction)
Yuna Lee Fox You (romance)
Tim Mak Misfire (non-fiction)
Naomi Novik The Last Graduate (sff)
Shelley Parker-Chan She Who Became the Sun (sff)
Gareth L. Powell Embers of War (sff)
Justin Richer & Antonio Sanso OAuth 2 in Action (non-fiction)
Dean Spade Mutual Aid (non-fiction)
Lana Swartz New Money (non-fiction)
Adam Tooze Shutdown (non-fiction)
Bill Watterson The Essential Calvin and Hobbes (strip collection)
Bill Willingham, et al. Fables: Storybook Love (graphic novel)
David Wong Real-World Cryptography (non-fiction)
Neon Yang The Black Tides of Heaven (sff)
Neon Yang The Red Threads of Fortune (sff)
Neon Yang The Descent of Monsters (sff)
Neon Yang The Ascent to Godhood (sff)
Xiran Jay Zhao Iron Widow (sff)

16 November 2021

Paul Tagliamonte: Measuring the Power Output of my SDRs

Over the last few years, I ve often wondered what the true power output of my SDRs are. It s a question with a shocking amount of complexity in the response, due to a number of factors (mostly Frequency). The ranges given in spec sheets are often extremely vague, and if I m being honest with myself, not incredibly helpful for being able to determine what specific filters and amplifiers I ll need to get a clean signal transmitted.
Hey, heads up! - This post contains extremely unvalidated and back of the napkin quality work to understand how my equipment works. Hopefully this work can be of help to others, but please double check any information you need for your own work!
I was specifically interested in what gain output (in dBm) looks like across the frequency range in particular, how variable the output dBm is when I change frequencies. The second question I had was understanding how linear the output gain is when adjusting the requested gain from the radio. Does a 2 dB increase on a HackRF API mean 2 dB of gain in dBm, no matter what the absolute value of the gain stage is? I ve finally bit the bullet and undertaken work to characterize the hardware I do have, with some outdated laboratory equipment I found on eBay. Of course, if it s worth doing, it s worth overdoing, so I spent a bit of time automating a handful of components in order to collect the data that I need from my SDRs. I bought an HP 437B, which is the cutting edge of 30 years ago, but still accurate to within 0.01dBm. I paired this Power Meter with an Agilent 8481A Power Sensor (-30 dBm to 20 dBm from 10MHz to 18GHz). For some of my radios, I was worried about exceeding the 20 dBm mark, so I used a 20db attenuator while I waited for a higher power power sensor. Finally, I was able to find a GPIB to USB interface, and get that interface working with the GPIB Kernel driver on my system. With all that out of the way, I was able to write Go bindings to my HP 437B to allow for totally headless and automated control in sync with my SDR s RF output. This allowed me to script the transmission of a sine wave at a controlled amplitude across a defined gain range and frequency range and read the Power Sensor s measured dBm output to characterize the Gain across frequency and configured Gain.

HackRF Looking at configured Gain against output power, the requested gain appears to have a fairly linear relation to the output signal power. The measured dBm ranged between the sensor noise floor to approx +13dBm. The average standard deviation of all tested gain values over the frequency range swept was +/-2dBm, with a minimum standard deviation of +/-0.8dBm, and a maximum of +/-3dBm. When looking at output power over the frequency range swept, the HackRF contains a distinctive (and frankly jarring) ripple across the Frequency range, with a clearly visible jump in gain somewhere around 2.1GHz. I have no idea what is causing this massive jump in output gain, nor what is causing these distinctive ripples. I d love to know more if anyone s familiar with HackRF s RF internals!

PlutoSDR The power output is very linear when operating above -20dB TX channel gain, but can get quite erratic the lower the output power is configured. The PlutoSDR s output power is directly related to the configured power level, and is generally predictable once a minimum power level is reached. The measured dBm ranged from the noise floor to 3.39 dBm, with an average standard deviation of +/-1.98 dBm, a minimum standard deviation of +/-0.91 dBm and a maximum standard deviation of +/-3.37 dBm. Generally, the power output is quite stable, and looks to have very even and wideband gain control. There s a few artifacts, which I have not confidently isolated to the SDR TX gain, noise (transmit artifacts such as intermodulation) or to my test setup. They appear fairly narrowband, so I m not overly worried about them yet. If anyone has any ideas what this could be, I d very much appreciate understanding why they exist!

Ettus B210 The power output on the Ettus B210 is higher (in dBm) than any of my other radios, but it has a very odd quirk where the power becomes nonlinear somewhere around -55dB TX channel gain. After that point, adding gain has no effect on the measured signal output in dBm up to 0 dB gain. The measured dBm ranged from the noise floor to 18.31 dBm, with an average standard deviation of +/-2.60 dBm, a minimum of +/-1.39 dBm and a maximum of +/-5.82 dBm. When the Gain is somewhere around the noise floor, the measured gain is incredibly erratic, which throws the maximum standard deviation significantly. I haven t isolated that to my test setup or the radio itself. I m inclined to believe it s my test setup. The radio has a fairly even and wideband gain, and so long as you re operating between -70dB to -55dB, fairly linear as well.

Summary Of all my radios, the Ettus B210 has the highest output (in dBm) over the widest frequency range, but the HackRF is a close second, especially after the gain bump kicks in around 2.1GHz. The Pluto SDR feels the most predictable and consistent, but also a very low output, comparatively - right around 0 dBm.
Name Max dBm stdev dBm stdev min dBm stdev max dBm
HackRF +12.6 +/-2.0 +/-0.8 +/-3.0
PlutoSDR +3.3 +/-2.0 +/-0.9 +/-3.7
B210 +18.3 +/-2.6 +/-1.4 +/-6.0

13 November 2021

Ruby Team: Ruby transition and packaging hints #1 - Adjusting Ruby version in commands

This is the first part of a series of short posts about issues that came up during the Ruby 3.0 transition and how to fix them. Hopefully more team members will join in and add their input. During the Ruby 3.0 transition there are essentially two different Ruby versions with two different binaries available, /usr/bin/ruby2.7 and /usr/bin/ruby3.0, while /usr/bin/ruby points to the current default version, which is Ruby 2.7. In some cases the tests shipped by the source packages will use shell commands to run scripts or Ruby code. It is imparative that in these cases the Ruby executable is not invoked by /usr/bin/ruby or ruby, because this will point to Ruby 2.7 only and fail if the tests are invoked with Ruby version 3. The fix is to rely on RbConfig.ruby which will point to the absolute pathname of the ruby command for the current Ruby environment, e.g.
cmd = "# RbConfig.ruby  ..."
This issue appeard for example in ruby-byebug and ruby-backports.

26 October 2021

Russell Coker: Links October 2021

Bloomburg has an insightful article about Juniper, the NSA, and the compromise of Netscreen [1]. It was worse than we previously thought and the Chinese government was involved. Haaretz has an amusing story about security issues at a credit card company based on a series of major WTFs [2]. They used WhatsApp for communicating with customers (despite the lack of support from Facebook for issues like account compromise), stored it on a phone (they should have used a desktop PC), didn t lock the phone down (should have been in a locked case and bolted down like any other financial security device), and allowed it to get stolen. Fortunately the thief was only after a free phone not the financial data stored on it. David Brin wrote an insightful blog post Should facts and successes matter in economics? Or politics? [3] which is part of his series about challenging conservatives to bet on their policies. Vice has an interesting article about a normal-looking USB-C to Lightning cable that intercepts data transfer and sends it out via an embedded Wifi AP [4]. Getting that into such a small space is an impressive engineering feat. The vendor already has a YSB-A to lightning cable with such features for $120 [5]. That s too expensive to just leave them lying around and hope that someone with interesting data finds them, but it s also quite cheap for a targeted attack. Interesting article about tracking people via Bluetooth MAC address or device name [6]. Most of the research is based on a man riding a bike around Norway and passively sniffing Bluetooth transmissions. You can buy commercial devices that can receive Bluetooth from 1Km away. A recent version of Bluetooth has random Mac addresses but that still allows tracking by device name which for many people is their own name. Cory Doctorow has a good summary of the ways that Facebook is rotten [7]. It s worse than you think. In 2019 almost all Facebook s top Christian pages were run by foreign troll farms [8]. This is partly due to Christians being gullible, but Facebook is also to blame for this. Cornell has an interesting article about using CRISPR to identify the gender of chicken eggs before they hatch [9]. This means that instead of killing roosters hatched from eggs for egg production they can just put those eggs for eating and save some money. Another option would be to genetically engineer more sexual dimorphism into chickens as the real problem is that hens for laying eggs are too thin to be good for eating so if you could have a breed of chicken with thin hens and fat cocks then all eggs could be hatched and the chickens used. The article claims that this is an ethical benefit of not killing baby roosters, but really it s about saving 50 cents per egg. Umair Haque wrote an insightful article about why everything will get more expensive as the externalities dating back to the industrial revolution have to be paid for [9]. Alexei Navalny (the jailed Russian opposition politician who Putin tried to murder) wrote an insightful article about why corruption is at the root of most world problems and how to solve it [10]. Cory Doctorow wrote an insightful article about breaking in to the writing industry which can apply to starting in most careers [11]. The main point is that people who have established careers have knowledge about starting a career that s at best outdated and at most totally irrelevant. Learning from people who are at most one step ahead of you is probably best. Peter Wehner wrote an insightful article for The Atlantic about the way churches in the US are breaking apart due to political issues [12]. Similar things appear to be happening in Australia for the same reason, conservative fear based politics which directly opposes everything in the Bible about Jesus is taking over churches. On the positive side this should destroy churches and the way churches are currently going they should be destroyed. The Guardian has an article about the incidence of reinfection with Covid19 [13]. The current expectation is that people who aren t vaccinated will probably get it about every 16 months if it becomes endemic (as it has in the US and will do in Australia if conservatives have their way). If the mortality rate is 2% each time then an unvaccinated person could expect a 15% chance of dying over the course of 10 years if there is no cumulative damage. However if damage to the heart and lungs accumulates over multiple courses of the disease then the probability of death over 10 years could be a lot higher. Psyche has an interesting article by Professor Jan-Willem van Prooijeni about the way that conspiracy theories bypass rationality [14]. The way that entertaining stories bypass rationality is particularly concerning given the way Facebook and other social media are driven by clickbait.

21 September 2021

Russell Coker: Links September 2021

Matthew Garrett wrote an interesting and insightful blog post about the license of software developed or co-developed by machine-learning systems [1]. One of his main points is that people in the FOSS community should aim for less copyright protection. The USENIX ATC 21/OSDI 21 Joint Keynote Address titled It s Time for Operating Systems to Rediscover Hardware has some inssightful points to make [2]. Timothy Roscoe makes some incendiaty points but backs them up with evidence. Is Linux really an OS? I recommend that everyone who s interested in OS design watch this lecture. Cory Doctorow wrote an interesting set of 6 articles about Disneyland, ride pricing, and crowd control [3]. He proposes some interesting ideas for reforming Disneyland. Benjamin Bratton wrote an insightful article about how philosophy failed in the pandemic [4]. He focuses on the Italian philosopher Giorgio Agamben who has a history of writing stupid articles that match Qanon talking points but with better language skills. Arstechnica has an interesting article about penetration testers extracting an encryption key from the bus used by the TPM on a laptop [5]. It s not a likely attack in the real world as most networks can be broken more easily by other methods. But it s still interesting to learn about how the technology works. The Portalist has an article about David Brin s Startide Rising series of novels and his thought s on the concept of Uplift (which he denies inventing) [6]. Jacobin has an insightful article titled You re Not Lazy But Your Boss Wants You to Think You Are [7]. Making people identify as lazy is bad for them and bad for getting them to do work. But this is the first time I ve seen it described as a facet of abusive capitalism. Jacobin has an insightful article about free public transport [8]. Apparently there are already many regions that have free public transport (Tallinn the Capital of Estonia being one example). Fare free public transport allows bus drivers to concentrate on driving not taking fares, removes the need for ticket inspectors, and generally provides a better service. It allows passengers to board buses and trams faster thus reducing traffic congestion and encourages more people to use public transport instead of driving and reduces road maintenance costs. Interesting research from Israel about bypassing facial ID [9]. Apparently they can make a set of 9 images that can pass for over 40% of the population. I didn t expect facial recognition to be an effective form of authentication, but I didn t expect it to be that bad. Edward Snowden wrote an insightful blog post about types of conspiracies [10]. Kevin Rudd wrote an informative article about Sky News in Australia [11]. We need to have a Royal Commission now before we have our own 6th Jan event. Steve from Big Mess O Wires wrote an informative blog post about USB-C and 4K 60Hz video [12]. Basically you can t have a single USB-C hub do 4K 60Hz video and be a USB 3.x hub unless you have compression software running on your PC (slow and only works on Windows), or have DisplayPort 1.4 or Thunderbolt (both not well supported). All of the options are not well documented on online store pages so lots of people will get unpleasant surprises when their deliveries arrive. Computers suck. Steinar H. Gunderson wrote an informative blog post about GaN technology for smaller power supplies [13]. A 65W USB-C PSU that fits the usual wall wart form factor is an interesting development.

30 August 2021

Russell Coker: Links August 2021

Sciencealert has an interesting article on a game to combat misinformation by microdosing people [1]. The game seemed overly simplistic to me, but I guess I m not the target demographic. Research shows it to work. Vice has an interesting and amusing article about mass walkouts of underpaid staff in the US [2]. The way that corporations are fighting an increase in the minimum wage doesn t seem financially beneficial for them. An increase in the minimum wage means small companies have to increase salaries too and the ratio of revenue to payroll is probably worse for small companies. It seems that companies like McDonalds make oppressing their workers a higher priority than making a profit. Interesting article in Vice about how the company Shot Spotter (which determines the locations of gunshots by sound) forges evidence for US police [3]. All convictions based on Shot Spotter evidence should be declared mistrials. BitsNBites has an interesting article on the fundamental flaws of SIMD (Single Instruction Multiple Data) [4]. The Daily Dot has a disturbing article anbout the possible future of the QAnon movement [5]. Let s hope they become too busy fighting each other to hurt many innocent people. Ben Taylor wrote an interesting blog post suggesting that Web Assembly should be a default binary target [6]. I don t support that idea but I think that considering it is useful. Web assembly could be used more for non-web things and it would be a better option than Node.js for some things. There are also some interesting corner cases like games, Minecraft was written in Java and there s no reason that Web Assembly couldn t do the same things. Vice has an interesting article about the Phantom encrypted phone service that ran on Blackberry handsets [7]. Australia really needs legislation based on the US RICO law! Vice has an interesting article about an encrypted phone company run by drug dealers [8]. Apparently after making an encrypted phone system for their own use they decided to sell it to others and made millions of dollars. They could have run a successful legal business. Salon has an insightful interview with Michael Petersen about his research on fake news and people who share it because they need chaos [9]. Apparently low status people who are status seeking are a main contributor to this, they share fake news knowingly to spread chaos. A society with less inequality would have less problems with fake news. Salon has another insightful interview with Michael Petersen, about is later research on fake news as an evolutionary strategy [10]. People knowingly share fake news to mobilise their supporters and to signal allegiance to their group. The more bizarre the beliefs are the more strongly they signal allegiance. If an opposing group has a belief then they can show support for their group by having the opposite belief (EG by opposing vaccination if the other political side supports doctors). He also suggests that lying can be a way of establishing dominance, the more honest people are opposed by a lie the more dominant the liar may seem. Vice has an amusing article about how police took over the Encrochat encrypted phone network that was mostly used by criminals [11]. It s amusing to read of criminals getting taken down like this. It s also interesting to note that the authorities messed up by breaking the wipe facility which alerted the criminals that their security was compromised. The investigation could have continued for longer if they hadn t changed the functionality of compromised phones. A later vice article mentioned that the malware installed on Encrochat devices recorded MAC addresses of Wifi access points which was used to locate the phones even though they had the GPS hardware removed. Cory Doctorow wrote an insightful article for Locus about the insufficient necessity of interoperability [12]. The problem if monopolies is not just an inability to interoperate with other services or leave it s losing control over your life. A few cartel participants interoperating will be able to do all the bad things to us tha a single monopolist could do.

24 August 2021

Norbert Preining: OBS builds of KDE/Plasma for Debian/Bullseye Testing Unstable

With the release of Debian/Bullseye last week, we can now support the stable Debian release (Bullseye) as well as Testing and Unstable releases with our build of KDE/Plasma (frameworks, plasma, gears, related programs) on the OBS builds. We used this switch to also consistently support three architectures: amd64, i386, and aarch64. To give full details, I repeat (and update) instructions for all here: First of all, you need to add my OBS key say in /etc/apt/trusted.gpg.d/obs-npreining.asc and add a file /etc/apt/sources.lists.d/obs-npreining-kde.list, containing the following lines, replacing the DISTRIBUTION part with one of Debian_11 (for Bullseye), Debian_Testing, or Debian_Unstable:
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other-deps/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/frameworks/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/plasma522/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/apps2108/DISTRIBUTION/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other/DISTRIBUTION/ ./
Notes Bullseye and pinning These packages are shipped with Distribution: unstable and thus might have a too low pin to be upgraded to automatically. One can fix that by manually selecting the packages, or adjusting the pin priority. One example is given in the comment section below, copied here:
Package: *
Pin: origin download.opensuse.org
Pin-Priority: 950
in /etc/apt/preferences Older sets Note that older sets (plasma521, apps2012 and apps2008) don t support all the different variants and will disappear soon.) Enjoy.

31 July 2021

Russell Coker: Links July 2021

The News Tribune published an article in 2004 about the Dove of Oneness , a mentally ill woman who got thousands of people to believe her crazy ideas about NESARA [1]. In recent time the QANON conspiracy theory has drawn on the NESARA cult and encouraged it s believers to borrow money and spend it in the belief that all debts will be forgiven (something which was not part of NESARA). The Wikipedia page about NESARA (proposed US legislation that was never considered by the US congress) notes that the second edition of the book about it was titled Draining the Swamp: The NESARA Story Monetary and Fiscal Policy Reform . It seems like the Trump cult has been following that for a long time. David Brin (best-selling SciFi Author and NASA consultant) wrote an insightful blog post about the Tytler Calumny [2], which is the false claim that democracy inevitably fails because poor people vote themselves money. When really the failure is of corrupt rich people subverting the government processes to enrich themselves at the expense of their country. It s worth reading, and his entire blog is also worth reading. Cory Doctorow has an insightful article about his own battle with tobacco addiction and the methods that tobacco companies and other horrible organisations use to prevent honest discussion about legislation [3]. Cory Doctorow has an insightful article about consent theater which is describes how consent in most agreements between corporations and people is a fraud [4]. The new GDPR sounds good. The forum for the War Thunder game had a discussion on the accuracy of the Challenger 2 tank which ended up with a man who claims to be a UK tank commander posting part of a classified repair manual [5]. That s pretty amusing, and also good advertising for War Thunder. After reading about this I discovered that it s free on Steam and runs on Linux! Unfortunately it whinged about my video drivers and refused to run. Corey Doctorow has an insightful and well researched article about the way the housing market works in the US [6]. For house prices to increase conditions for renters need to be worse, that may work for home owners in the short term but then in the long term their children and grandchildren will end up renting.

30 June 2021

Aigars Mahinovs: Keeping it as simple as possible

You know that you've had the same server too long when they discontinue the entire class of servers you were using and you need to migrate to a new instance. And you know you've not done anything with that server (and the blog running on it) for too long when you have no idea how that thing is actually working. Its a good opportunity to start over from scratch, and a good motivation to the new thing as simply as humanly possible, or even simpler. So I am switching to a statically generated blog as well. Not sure what took me so long, but thank good the tooling has really improved since the last time I looked. It was as simple as picking Nikola, finding its import_feed plugin, changing the BLOG_RSS_LIMIT in my Django Mezzanine blog to a thousand (to export all posts via RSS/ATOM feed), fixing some bugs in the import_feed plugin, waiting a few minutes for the full feed to generate and to be imported, adjusting the config of the resulting site, posting that to git and writing a simple shell script to pull that repo periodically and call nikola build on it, as well as config to serve ther result via ngnix. Done. After that creating a new blog post is just nikola new_post and editing it in vim and pushing to git. I prefer Markdown, but it supports all kinds of formats. And the old posts are just stored as HTML. Really simple. I think I will spend more time fighting with Google to allow me to forward email from my domain to my GMail postbox without it refusing all of it as spam.

Russell Coker: Links June 2021

MIT Technology Review has an interesting article about Google Project Zero shutting down a western intelligence operation [1]. There s an Internet trend of people eating rotten meat they call high meat (rotten meat) [2]. This is up there with people setting themselves on fire and nut shot videos. A young female who was making popular Twitter posts about motorbikes turned out to be a 50yo man using deep fake technology [3]. He has long hair IRL and just needed to replace his face. After coming out of the closet he has continued making such videos and remains popular. FYHTECH has an informative blog post about using sgdisk to backup and restore GPT partition tables [4]. This is in the Debian package gdisk along with several other tools for managing partition tables. One interesting thing to note is that you can backup a partition table and restore to a smaller device (with a bunch of warnings that you can ignore if you know what you are doing). This is the only way I ve discovered to cleanly truncate a GPT partitioned disk, which is sometimes necessary when running VMs. Insightful blog post about PCIe bifurcation and how PCIe lanes are assigned to sockets [5]. This explains why many motherboards have sockets with unused PCIe lanes, EG *8 sockets that are wired for *4. The PCIe slots all go back to the CPU which has a limited number of *16 PCIe connections that are bifurcated to make the larger number of PCIe slots on the motherboard. New Republic has an interesting article on the infamous transphobe Jordan Peterson s battle with tranquiliser dependency [6]. Wired has an interesting article about the hack of RSA infrastructure related to the SecureID keys 10 years ago [7]. Apparently some 10 year NDAs had delayed it. There are many posts about the situation with Freenode, I think that this one best captures the problems in the shortest amount of text [8]. You could spend a few hours reading about it as I have just done, but just reading this gives you the basics that you need to know to avoid Freenode. That blog post has links to articles about Andrew Lee s involvement with Mt Gox and claims to be the heir to the throne of Korea (which is not a monarchy). Nicholas Wade wrote an insightful and informative article about the origin of Covid19, which leads to the conclusion that it was made in a Chinese laboratory [9]. I first saw this in David Brin s Facebook feed. I would be hesitant to share this sort of thing if it wasn t reviewed by a reliable source, I think David Brin has the skill to analyse this sort of article and the contacts to allow him to seek verification of any scientific issues that are outside his field. I believe that this article is reliable and it s conclusion is most likely to be correct. Interesting Wired article about an art project using display computers at Apple stores to photograph people [10]. Ends with a visit from the FBI.

28 April 2021

Russell Coker: Links April 2021

Dr Justin Lehmiller s blog post comparing his official (academic style) and real biographies is interesting [1]. Also the rest of his blog is interesting too, he works at the Kinsey Institute so you know he s good. Media Matters has an interesting article on the spread of vaccine misinformation on Instagram [2]. John Goerzen wrote a long post summarising some of the many ways of having a decentralised Internet [3]. One problem he didn t address is how to choose between them, I could spend months of work to setup a fraction of those services. Erasmo Acosta wrote an interesting medium article Could Something as Pedestrian as the Mitochondria Unlock the Mystery of the Great Silence? [4]. I don t know enough about biology to determine how plausible this is. But it is a worry, I hope that humans will meet extra-terrestrial intelligences at some future time. Meredith Haggerty wrote an insightful Medium article about the love vs money aspects of romantic comedies [5]. Changes in viewer demographics would be one factor that makes lead actors in romantic movies significantly less wealthy in recent times. Informative article about ZIP compression and the history of compression in general [6]. Vice has an insightful article about one way of taking over SMS access of phones without affecting voice call or data access [7]. With this method the victom won t notice that they are having their sservice interfered with until it s way too late. They also explain the chain of problems in the US telecommunications industry that led to this. I wonder what s happening in this regard in other parts of the world. The clown code of ethics (8 Commandments) is interesting [8]. Sam Hartman wrote an insightful blog post about the problems with RMS and how to deal with him [9]. Also Sam Whitton has an interesting take on this [10]. Another insightful post is by Selam G about RMS long history of bad behavior and the way universities are run [11]. Cory Doctorow wrote an insightful article for Locus about free markets with a focus on DRM on audio books [12]. We need legislative changes to fix this!

9 April 2021

Michael Prokop: A Ceph war story

It all started with the big bang! We nearly lost 33 of 36 disks on a Proxmox/Ceph Cluster; this is the story of how we recovered them. At the end of 2020, we eventually had a long outstanding maintenance window for taking care of system upgrades at a customer. During this maintenance window, which involved reboots of server systems, the involved Ceph cluster unexpectedly went into a critical state. What was planned to be a few hours of checklist work in the early evening turned out to be an emergency case; let s call it a nightmare (not only because it included a big part of the night). Since we have learned a few things from our post mortem and RCA, it s worth sharing those with others. But first things first, let s step back and clarify what we had to deal with. The system and its upgrade One part of the upgrade included 3 Debian servers (we re calling them server1, server2 and server3 here), running on Proxmox v5 + Debian/stretch with 12 Ceph OSDs each (65.45TB in total), a so-called Proxmox Hyper-Converged Ceph Cluster. First, we went for upgrading the Proxmox v5/stretch system to Proxmox v6/buster, before updating Ceph Luminous v12.2.13 to the latest v14.2 release, supported by Proxmox v6/buster. The Proxmox upgrade included updating corosync from v2 to v3. As part of this upgrade, we had to apply some configuration changes, like adjust ring0 + ring1 address settings and add a mon_host configuration to the Ceph configuration. During the first two servers reboots, we noticed configuration glitches. After fixing those, we went for a reboot of the third server as well. Then we noticed that several Ceph OSDs were unexpectedly down. The NTP service wasn t working as expected after the upgrade. The underlying issue is a race condition of ntp with systemd-timesyncd (see #889290). As a result, we had clock skew problems with Ceph, indicating that the Ceph monitors clocks aren t running in sync (which is essential for proper Ceph operation). We initially assumed that our Ceph OSD failure derived from this clock skew problem, so we took care of it. After yet another round of reboots, to ensure the systems are running all with identical and sane configurations and services, we noticed lots of failing OSDs. This time all but three OSDs (19, 21 and 22) were down:
% sudo ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.44138 root default
-2       21.81310     host server1
 0   hdd  1.08989         osd.0    down  1.00000 1.00000
 1   hdd  1.08989         osd.1    down  1.00000 1.00000
 2   hdd  1.63539         osd.2    down  1.00000 1.00000
 3   hdd  1.63539         osd.3    down  1.00000 1.00000
 4   hdd  1.63539         osd.4    down  1.00000 1.00000
 5   hdd  1.63539         osd.5    down  1.00000 1.00000
18   hdd  2.18279         osd.18   down  1.00000 1.00000
20   hdd  2.18179         osd.20   down  1.00000 1.00000
28   hdd  2.18179         osd.28   down  1.00000 1.00000
29   hdd  2.18179         osd.29   down  1.00000 1.00000
30   hdd  2.18179         osd.30   down  1.00000 1.00000
31   hdd  2.18179         osd.31   down  1.00000 1.00000
-4       21.81409     host server2
 6   hdd  1.08989         osd.6    down  1.00000 1.00000
 7   hdd  1.08989         osd.7    down  1.00000 1.00000
 8   hdd  1.63539         osd.8    down  1.00000 1.00000
 9   hdd  1.63539         osd.9    down  1.00000 1.00000
10   hdd  1.63539         osd.10   down  1.00000 1.00000
11   hdd  1.63539         osd.11   down  1.00000 1.00000
19   hdd  2.18179         osd.19     up  1.00000 1.00000
21   hdd  2.18279         osd.21     up  1.00000 1.00000
22   hdd  2.18279         osd.22     up  1.00000 1.00000
32   hdd  2.18179         osd.32   down  1.00000 1.00000
33   hdd  2.18179         osd.33   down  1.00000 1.00000
34   hdd  2.18179         osd.34   down  1.00000 1.00000
-3       21.81419     host server3
12   hdd  1.08989         osd.12   down  1.00000 1.00000
13   hdd  1.08989         osd.13   down  1.00000 1.00000
14   hdd  1.63539         osd.14   down  1.00000 1.00000
15   hdd  1.63539         osd.15   down  1.00000 1.00000
16   hdd  1.63539         osd.16   down  1.00000 1.00000
17   hdd  1.63539         osd.17   down  1.00000 1.00000
23   hdd  2.18190         osd.23   down  1.00000 1.00000
24   hdd  2.18279         osd.24   down  1.00000 1.00000
25   hdd  2.18279         osd.25   down  1.00000 1.00000
35   hdd  2.18179         osd.35   down  1.00000 1.00000
36   hdd  2.18179         osd.36   down  1.00000 1.00000
37   hdd  2.18179         osd.37   down  1.00000 1.00000
Our blood pressure increased slightly! Did we just lose all of our cluster? What happened, and how can we get all the other OSDs back? We stumbled upon this beauty in our logs:
kernel: [   73.697957] XFS (sdl1): SB stripe unit sanity check failed
kernel: [   73.698002] XFS (sdl1): Metadata corruption detected at xfs_sb_read_verify+0x10e/0x180 [xfs], xfs_sb block 0xffffffffffffffff
kernel: [   73.698799] XFS (sdl1): Unmount and run xfs_repair
kernel: [   73.699199] XFS (sdl1): First 128 bytes of corrupted metadata buffer:
kernel: [   73.699677] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
kernel: [   73.700205] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
kernel: [   73.700836] 00000020: 62 44 2b c0 e6 22 40 d7 84 3d e1 cc 65 88 e9 d8  bD+.."@..=..e...
kernel: [   73.701347] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
kernel: [   73.701770] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
ceph-disk[4240]: mount: /var/lib/ceph/tmp/mnt.jw367Y: mount(2) system call failed: Structure needs cleaning.
ceph-disk[4240]: ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', u'xfs', '-o', 'noatime,inode64', '--', '/dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.cdda39ed-5
ceph/tmp/mnt.jw367Y']' returned non-zero exit status 32
kernel: [   73.702162] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
kernel: [   73.702550] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
kernel: [   73.702975] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
kernel: [   73.703373] XFS (sdl1): SB validate failed with error -117.
The same issue was present for the other failing OSDs. We hoped, that the data itself was still there, and only the mounting of the XFS partitions failed. The Ceph cluster was initially installed in 2017 with Ceph jewel/10.2 with the OSDs on filestore (nowadays being a legacy approach to storing objects in Ceph). However, we migrated the disks to bluestore since then (with ceph-disk and not yet via ceph-volume what s being used nowadays). Using ceph-disk introduces these 100MB XFS partitions containing basic metadata for the OSD. Given that we had three working OSDs left, we decided to investigate how to rebuild the failing ones. Some folks on #ceph (thanks T1, ormandj + peetaur!) were kind enough to share how working XFS partitions looked like for them. After creating a backup (via dd), we tried to re-create such an XFS partition on server1. We noticed that even mounting a freshly created XFS partition failed:
synpromika@server1 ~ % sudo mkfs.xfs -f -i size=2048 -m uuid="4568c300-ad83-4288-963e-badcd99bf54f" /dev/sdc1
meta-data=/dev/sdc1              isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
synpromika@server1 ~ % sudo mount /dev/sdc1 /mnt/ceph-recovery
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
cache_node_purge: refcount was 1, not zero (node=0x1d3c400)
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x18800/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x18800/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x24c00/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x24c00/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0xc400/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0xc400/0x1000
releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!found dirty buffer (bulk) on free list!bad magic number
bad magic number
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
releasing dirty buffer (bulk) to free list!mount: /mnt/ceph-recovery: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.
Ouch. This very much looked related to the actual issue we re seeing. So we tried to execute mkfs.xfs with a bunch of different sunit/swidth settings. Using -d sunit=512 -d swidth=512 at least worked then, so we decided to force its usage in the creation of our OSD XFS partition. This brought us a working XFS partition. Please note, sunit must not be larger than swidth (more on that later!). Then we reconstructed how to restore all the metadata for the OSD (activate.monmap, active, block_uuid, bluefs, ceph_fsid, fsid, keyring, kv_backend, magic, mkfs_done, ready, require_osd_release, systemd, type, whoami). To identify the UUID, we can read the data from ceph --format json osd dump , like this for all our OSDs (Zsh syntax ftw!):
synpromika@server1 ~ % for f in  0..37  ; printf "osd-$f: %s\n" "$(sudo ceph --format json osd dump   jq -r ".osds[]   select(.osd==$f)   .uuid")"
osd-0: 4568c300-ad83-4288-963e-badcd99bf54f
osd-1: e573a17a-ccde-4719-bdf8-eef66903ca4f
osd-2: 0e1b2626-f248-4e7d-9950-f1a46644754e
osd-3: 1ac6a0a2-20ee-4ed8-9f76-d24e900c800c
[...]
Identifying the corresponding raw device for each OSD UUID is possible via:
synpromika@server1 ~ % UUID="4568c300-ad83-4288-963e-badcd99bf54f"
synpromika@server1 ~ % readlink -f /dev/disk/by-partuuid/"$ UUID "
/dev/sdc1
The OSD s key ID can be retrieved via:
synpromika@server1 ~ % OSD_ID=0
synpromika@server1 ~ % sudo ceph auth get osd."$ OSD_ID " -f json 2>/dev/null   jq -r '.[]   .key'
AQCKFpZdm0We[...]
Now we also need to identify the underlying block device:
synpromika@server1 ~ % OSD_ID=0
synpromika@server1 ~ % sudo ceph osd metadata osd."$ OSD_ID " -f json   jq -r '.bluestore_bdev_partition_path'    
/dev/sdc2
With all of this, we reconstructed the keyring, fsid, whoami, block + block_uuid files. All the other files inside the XFS metadata partition are identical on each OSD. So after placing and adjusting the corresponding metadata on the XFS partition for Ceph usage, we got a working OSD hurray! Since we had to fix yet another 32 OSDs, we decided to automate this XFS partitioning and metadata recovery procedure. We had a network share available on /srv/backup for storing backups of existing partition data. On each server, we tested the procedure with one single OSD before iterating over the list of remaining failing OSDs. We started with a shell script on server1, then adjusted the script for server2 and server3. This is the script, as we executed it on the 3rd server. Thanks to this, we managed to get the Ceph cluster up and running again. We didn t want to continue with the Ceph upgrade itself during the night though, as we wanted to know exactly what was going on and why the system behaved like that. Time for RCA! Root Cause Analysis So all but three OSDs on server2 failed, and the problem seems to be related to XFS. Therefore, our starting point for the RCA was, to identify what was different on server2, as compared to server1 + server3. My initial assumption was that this was related to some firmware issues with the involved controller (and as it turned out later, I was right!). The disks were attached as JBOD devices to a ServeRAID M5210 controller (with a stripe size of 512). Firmware state:
synpromika@server1 ~ % sudo storcli64 /c0 show all   grep '^Firmware'
Firmware Package Build = 24.16.0-0092
Firmware Version = 4.660.00-8156
synpromika@server2 ~ % sudo storcli64 /c0 show all   grep '^Firmware'
Firmware Package Build = 24.21.0-0112
Firmware Version = 4.680.00-8489
synpromika@server3 ~ % sudo storcli64 /c0 show all   grep '^Firmware'
Firmware Package Build = 24.16.0-0092
Firmware Version = 4.660.00-8156
This looked very promising, as server2 indeed runs with a different firmware version on the controller. But how so? Well, the motherboard of server2 got replaced by a Lenovo/IBM technician in January 2020, as we had a failing memory slot during a memory upgrade. As part of this procedure, the Lenovo/IBM technician installed the latest firmware versions. According to our documentation, some OSDs were rebuilt (due to the filestore->bluestore migration) in March and April 2020. It turned out that precisely those OSDs were the ones that survived the upgrade. So the surviving drives were created with a different firmware version running on the involved controller. All the other OSDs were created with an older controller firmware. But what difference does this make? Now let s check firmware changelogs. For the 24.21.0-0097 release we found this:
- Cannot create or mount xfs filesystem using xfsprogs 4.19.x kernel 4.20(SCGCQ02027889)
- xfs_info command run on an XFS file system created on a VD of strip size 1M shows sunit and swidth as 0(SCGCQ02056038)
Our XFS problem certainly was related to the controller s firmware. We also recalled that our monitoring system reported different sunit settings for the OSDs that were rebuilt in March and April. For example, OSD 21 was recreated and got different sunit settings:
WARN  server2.example.org  Mount options of /var/lib/ceph/osd/ceph-21      WARN - Missing: sunit=1024, Exceeding: sunit=512
We compared the new OSD 21 with an existing one (OSD 25 on server3):
synpromika@server2 ~ % systemctl show var-lib-ceph-osd-ceph\\x2d21.mount   grep sunit
Options=rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota
synpromika@server3 ~ % systemctl show var-lib-ceph-osd-ceph\\x2d25.mount   grep sunit
Options=rw,noatime,attr2,inode64,sunit=1024,swidth=512,noquota
Thanks to our documentation, we could compare execution logs of their creation:
% diff -u ceph-disk-osd-25.log ceph-disk-osd-21.log
-synpromika@server2 ~ % sudo ceph-disk -v prepare --bluestore /dev/sdj --osd-id 25
+synpromika@server3 ~ % sudo ceph-disk -v prepare --bluestore /dev/sdi --osd-id 21
[...]
-command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdj1
-meta-data=/dev/sdj1              isize=2048   agcount=4, agsize=6272 blks
[...]
+command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdi1
+meta-data=/dev/sdi1              isize=2048   agcount=4, agsize=6336 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
-data     =                       bsize=4096   blocks=25088, imaxpct=25
-         =                       sunit=128    swidth=64 blks
+data     =                       bsize=4096   blocks=25344, imaxpct=25
+         =                       sunit=64     swidth=64 blks
 naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
 log      =internal log           bsize=4096   blocks=1608, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
 realtime =none                   extsz=4096   blocks=0, rtextents=0
[...]
So back then, we even tried to track this down but couldn t make sense of it yet. But now this sounds very much like it is related to the problem we saw with this Ceph/XFS failure. We follow Occam s razor, assuming the simplest explanation is usually the right one, so let s check the disk properties and see what differs:
synpromika@server1 ~ % sudo blockdev --getsz --getsize64 --getss --getpbsz --getiomin --getioopt /dev/sdk
4685545472
2398999281664
512
4096
524288
262144
synpromika@server2 ~ % sudo blockdev --getsz --getsize64 --getss --getpbsz --getiomin --getioopt /dev/sdk
4685545472
2398999281664
512
4096
262144
262144
See the difference between server1 and server2 for identical disks? The getiomin option now reports something different for them:
synpromika@server1 ~ % sudo blockdev --getiomin /dev/sdk            
524288
synpromika@server1 ~ % cat /sys/block/sdk/queue/minimum_io_size
524288
synpromika@server2 ~ % sudo blockdev --getiomin /dev/sdk 
262144
synpromika@server2 ~ % cat /sys/block/sdk/queue/minimum_io_size
262144
It doesn t make sense that the minimum I/O size (iomin, AKA BLKIOMIN) is bigger than the optimal I/O size (ioopt, AKA BLKIOOPT). This leads us to Bug 202127 cannot mount or create xfs on a 597T device, which matches our findings here. But why did this XFS partition work in the past and fails now with the newer kernel version? The XFS behaviour change Now given that we have backups of all the XFS partition, we wanted to track down, a) when this XFS behaviour was introduced, and b) whether, and if so how it would be possible to reuse the XFS partition without having to rebuild it from scratch (e.g. if you would have no working Ceph OSD or backups left). Let s look at such a failing XFS partition with the Grml live system:
root@grml ~ # grml-version
grml64-full 2020.06 Release Codename Ausgehfuahangl [2020-06-24]
root@grml ~ # uname -a
Linux grml 5.6.0-2-amd64 #1 SMP Debian 5.6.14-2 (2020-06-09) x86_64 GNU/Linux
root@grml ~ # grml-hostname grml-2020-06
Setting hostname to grml-2020-06: done
root@grml ~ # exec zsh
root@grml-2020-06 ~ # dpkg -l xfsprogs util-linux
Desired=Unknown/Install/Remove/Purge/Hold
  Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
 / Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
 / Name           Version      Architecture Description
+++-==============-============-============-=========================================
ii  util-linux     2.35.2-4     amd64        miscellaneous system utilities
ii  xfsprogs       5.6.0-1+b2   amd64        Utilities for managing the XFS filesystem
There it s failing, no matter which mount option we try:
root@grml-2020-06 ~ # mount ./sdd1.dd /mnt
mount: /mnt: mount(2) system call failed: Structure needs cleaning.
root@grml-2020-06 ~ # dmesg   tail -30
[...]
[   64.788640] XFS (loop1): SB stripe unit sanity check failed
[   64.788671] XFS (loop1): Metadata corruption detected at xfs_sb_read_verify+0x102/0x170 [xfs], xfs_sb block 0xffffffffffffffff
[   64.788671] XFS (loop1): Unmount and run xfs_repair
[   64.788672] XFS (loop1): First 128 bytes of corrupted metadata buffer:
[   64.788673] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
[   64.788674] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   64.788675] 00000020: 32 b6 dc 35 53 b7 44 96 9d 63 30 ab b3 2b 68 36  2..5S.D..c0..+h6
[   64.788675] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
[   64.788675] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
[   64.788676] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
[   64.788677] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
[   64.788677] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
[   64.788679] XFS (loop1): SB validate failed with error -117.
root@grml-2020-06 ~ # mount -t xfs -o rw,relatime,attr2,inode64,sunit=1024,swidth=512,noquota ./sdd1.dd /mnt/
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop1, missing codepage or helper program, or other error.
32 root@grml-2020-06 ~ # dmesg   tail -1
[   66.342976] XFS (loop1): stripe width (512) must be a multiple of the stripe unit (1024)
root@grml-2020-06 ~ # mount -t xfs -o rw,relatime,attr2,inode64,sunit=512,swidth=512,noquota ./sdd1.dd /mnt/
mount: /mnt: mount(2) system call failed: Structure needs cleaning.
32 root@grml-2020-06 ~ # dmesg   tail -14
[   66.342976] XFS (loop1): stripe width (512) must be a multiple of the stripe unit (1024)
[   80.751277] XFS (loop1): SB stripe unit sanity check failed
[   80.751323] XFS (loop1): Metadata corruption detected at xfs_sb_read_verify+0x102/0x170 [xfs], xfs_sb block 0xffffffffffffffff 
[   80.751324] XFS (loop1): Unmount and run xfs_repair
[   80.751325] XFS (loop1): First 128 bytes of corrupted metadata buffer:
[   80.751327] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
[   80.751328] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   80.751330] 00000020: 32 b6 dc 35 53 b7 44 96 9d 63 30 ab b3 2b 68 36  2..5S.D..c0..+h6
[   80.751331] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
[   80.751331] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
[   80.751332] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
[   80.751333] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
[   80.751334] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
[   80.751338] XFS (loop1): SB validate failed with error -117.
Also xfs_repair doesn t help either:
root@grml-2020-06 ~ # xfs_info ./sdd1.dd
meta-data=./sdd1.dd              isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@grml-2020-06 ~ # xfs_repair ./sdd1.dd
Phase 1 - find and verify superblock...
bad primary superblock - bad stripe width in superblock !!!
attempting to find secondary superblock...
..............................................................................................Sorry, could not find valid secondary superblock
Exiting now.
With the SB stripe unit sanity check failed message, we could easily track this down to the following commit fa4ca9c:
% git show fa4ca9c5574605d1e48b7e617705230a0640b6da   cat
commit fa4ca9c5574605d1e48b7e617705230a0640b6da
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Jun 5 10:06:16 2018 -0700
    
    xfs: catch bad stripe alignment configurations
    
    When stripe alignments are invalid, data alignment algorithms in the
    allocator may not work correctly. Ensure we catch superblocks with
    invalid stripe alignment setups at mount time. These data alignment
    mismatches are now detected at mount time like this:
    
    XFS (loop0): SB stripe unit sanity check failed
    XFS (loop0): Metadata corruption detected at xfs_sb_read_verify+0xab/0x110, xfs_sb block 0xffffffffffffffff
    XFS (loop0): Unmount and run xfs_repair
    XFS (loop0): First 128 bytes of corrupted metadata buffer:
    0000000091c2de02: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 10 00  XFSB............
    0000000023bff869: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000000cdd8c893: 17 32 37 15 ff ca 46 3d 9a 17 d3 33 04 b5 f1 a2  .27...F=...3....
    000000009fd2844f: 00 00 00 00 00 00 00 04 00 00 00 00 00 00 06 d0  ................
    0000000088e9b0bb: 00 00 00 00 00 00 06 d1 00 00 00 00 00 00 06 d2  ................
    00000000ff233a20: 00 00 00 01 00 00 10 00 00 00 00 01 00 00 00 00  ................
    000000009db0ac8b: 00 00 03 60 e1 34 02 00 08 00 00 02 00 00 00 00  ... .4..........
    00000000f7022460: 00 00 00 00 00 00 00 00 0c 09 0b 01 0c 00 00 19  ................
    XFS (loop0): SB validate failed with error -117.
    
    And the mount fails.
    
    Signed-off-by: Dave Chinner <dchinner@redhat.com>
    Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
diff --git fs/xfs/libxfs/xfs_sb.c fs/xfs/libxfs/xfs_sb.c
index b5dca3c8c84d..c06b6fc92966 100644
--- fs/xfs/libxfs/xfs_sb.c
+++ fs/xfs/libxfs/xfs_sb.c
@@ -278,6 +278,22 @@ xfs_mount_validate_sb(
                return -EFSCORRUPTED;
         
        
+       if (sbp->sb_unit)  
+               if (!xfs_sb_version_hasdalign(sbp)  
+                   sbp->sb_unit > sbp->sb_width  
+                   (sbp->sb_width % sbp->sb_unit) != 0)  
+                       xfs_notice(mp, "SB stripe unit sanity check failed");
+                       return -EFSCORRUPTED;
+                 
+         else if (xfs_sb_version_hasdalign(sbp))   
+               xfs_notice(mp, "SB stripe alignment sanity check failed");
+               return -EFSCORRUPTED;
+         else if (sbp->sb_width)  
+               xfs_notice(mp, "SB stripe width sanity check failed");
+               return -EFSCORRUPTED;
+        
+
+       
        if (xfs_sb_version_hascrc(&mp->m_sb) &&
            sbp->sb_blocksize < XFS_MIN_CRC_BLOCKSIZE)  
                xfs_notice(mp, "v5 SB sanity check failed");
This change is included in kernel versions 4.18-rc1 and newer:
% git describe --contains fa4ca9c5574605d1e48
v4.18-rc1~37^2~14
Now let s try with an older kernel version (4.9.0), using old Grml 2017.05 release:
root@grml ~ # grml-version
grml64-small 2017.05 Release Codename Freedatensuppe [2017-05-31]
root@grml ~ # uname -a
Linux grml 4.9.0-1-grml-amd64 #1 SMP Debian 4.9.29-1+grml.1 (2017-05-24) x86_64 GNU/Linux
root@grml ~ # lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 9.0 (stretch)
Release:        9.0
Codename:       stretch
root@grml ~ # grml-hostname grml-2017-05
Setting hostname to grml-2017-05: done
root@grml ~ # exec zsh
root@grml-2017-05 ~ #
root@grml-2017-05 ~ # xfs_info ./sdd1.dd
xfs_info: ./sdd1.dd is not a mounted XFS filesystem
1 root@grml-2017-05 ~ # xfs_repair ./sdd1.dd
Phase 1 - find and verify superblock...
bad primary superblock - bad stripe width in superblock !!!
attempting to find secondary superblock...
..............................................................................................Sorry, could not find valid secondary superblock
Exiting now.
1 root@grml-2017-05 ~ # mount ./sdd1.dd /mnt
root@grml-2017-05 ~ # mount -t xfs
/root/sdd1.dd on /mnt type xfs (rw,relatime,attr2,inode64,sunit=1024,swidth=512,noquota)
root@grml-2017-05 ~ # ls /mnt
activate.monmap  active  block  block_uuid  bluefs  ceph_fsid  fsid  keyring  kv_backend  magic  mkfs_done  ready  require_osd_release  systemd  type  whoami
root@grml-2017-05 ~ # xfs_info /mnt
meta-data=/dev/loop1             isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Mounting there indeed works! Now, if we mount the filesystem with new and proper sunit/swidth settings using the older kernel, it should rewrite them on disk:
root@grml-2017-05 ~ # mount -t xfs -o sunit=512,swidth=512 ./sdd1.dd /mnt/
root@grml-2017-05 ~ # umount /mnt/
And indeed, mounting this rewritten filesystem then also works with newer kernels:
root@grml-2020-06 ~ # mount ./sdd1.rewritten /mnt/
root@grml-2020-06 ~ # xfs_info /root/sdd1.rewritten
meta-data=/dev/loop1             isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=64    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@grml-2020-06 ~ # mount -t xfs                
/root/sdd1.rewritten on /mnt type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=512,swidth=512,noquota)
FTR: The sunit=512,swidth=512 from the xfs mount option is identical to xfs_info s output sunit=64,swidth=64 (because mount.xfs s sunit value is given in 512-byte block units, see man 5 xfs, and the xfs_info output reported here is in blocks with a block size (bsize) of 4096, so sunit = 512*512 := 64*4096 ). mkfs uses minimum and optimal sizes for stripe unit and stripe width; you can check this e.g. via (note that server2 with fixed firmware version reports proper values, whereas server3 with broken controller firmware reports non-sense):
synpromika@server2 ~ % for i in /sys/block/sd*/queue/ ; do printf "%s: %s %s\n" "$i" "$(cat "$i"/minimum_io_size)" "$(cat "$i"/optimal_io_size)" ; done
[...]
/sys/block/sdc/queue/: 262144 262144
/sys/block/sdd/queue/: 262144 262144
/sys/block/sde/queue/: 262144 262144
/sys/block/sdf/queue/: 262144 262144
/sys/block/sdg/queue/: 262144 262144
/sys/block/sdh/queue/: 262144 262144
/sys/block/sdi/queue/: 262144 262144
/sys/block/sdj/queue/: 262144 262144
/sys/block/sdk/queue/: 262144 262144
/sys/block/sdl/queue/: 262144 262144
/sys/block/sdm/queue/: 262144 262144
/sys/block/sdn/queue/: 262144 262144
[...]
synpromika@server3 ~ % for i in /sys/block/sd*/queue/ ; do printf "%s: %s %s\n" "$i" "$(cat "$i"/minimum_io_size)" "$(cat "$i"/optimal_io_size)" ; done
[...]
/sys/block/sdc/queue/: 524288 262144
/sys/block/sdd/queue/: 524288 262144
/sys/block/sde/queue/: 524288 262144
/sys/block/sdf/queue/: 524288 262144
/sys/block/sdg/queue/: 524288 262144
/sys/block/sdh/queue/: 524288 262144
/sys/block/sdi/queue/: 524288 262144
/sys/block/sdj/queue/: 524288 262144
/sys/block/sdk/queue/: 524288 262144
/sys/block/sdl/queue/: 524288 262144
/sys/block/sdm/queue/: 524288 262144
/sys/block/sdn/queue/: 524288 262144
[...]
This is the underlying reason why the initially created XFS partitions were created with incorrect sunit/swidth settings. The broken firmware of server1 and server3 was the cause of the incorrect settings they were ignored by old(er) xfs/kernel versions, but treated as an error by new ones. Make sure to also read the XFS FAQ regarding How to calculate the correct sunit,swidth values for optimal performance . We also stumbled upon two interesting reads in RedHat s knowledge base: 5075561 + 2150101 (requires an active subscription, though) and #1835947. Am I affected? How to work around it? To check whether your XFS mount points are affected by this issue, the following command line should be useful:
awk '$3 == "xfs" print $2 ' /proc/self/mounts   while read mount ; do echo -n "$mount " ; xfs_info $mount   awk '$0 ~ "swidth" gsub(/.*=/,"",$2); gsub(/.*=/,"",$3); print $2,$3 '   awk '  if ($1 > $2) print "impacted"; else print "OK" ' ; done
If you run into the above situation, the only known solution to get your original XFS partition working again, is to boot into an older kernel version again (4.17 or older), mount the XFS partition with correct sunit/swidth settings and then boot back into your new system (kernel version wise). Lessons learned Thanks: Darshaka Pathirana, Chris Hofstaedtler and Michael Hanscho. Looking for help with your IT infrastructure? Let us know!

23 November 2020

Shirish Agarwal: White Hat Senior and Education

I had been thinking of doing a blog post on RCEP which China signed with 14 countries a week and a day back but this new story has broken and is being viraled a bit on the interwebs, especially twitter and is pretty much in our domain so thought would be better to do a blog post about it. Also, there is quite a lot packed so quite a bit of unpacking to do.

Whitehat, Greyhat and Blackhat For those of you who may not, there are actually three terms especially in computer science that one comes across. Those are white hats, grey hats and black hats. Now clinically white hats are like the fiery angels or the good guys who basically take permissions to try and find out weakness in an application, program, website, organization and so on and so forth. A somewhat dated reference to hacker could be Sandra Bullock (The Net 1995) , Sneakers (1992), Live Free or Die Hard (2007) . Of the three one could argue that Sandra was actually into viruses which are part of computer security but still she showed some bad-ass skills, but then that is what actors are paid to do  Sneakers was much more interesting for me because in that you got the best key which can unlock any lock, something like quantum computing is supposed to do. One could equate both the first movies in either as a White hat or a Grey hat . A Grey hat is more flexible in his/her moral values, and they are plenty of such people. For e.g. Julius Assange could be described as a Grey hat, but as you can see and understand those are moral issues.

A black hat on the other hand is one who does things for profit even if it harms the others. The easiest fictitious examples are all Die Hard series, all of them except the 4th one, all had bad guys or black hats. The 4th also had but is the odd one out as it had Matthew Farell (Justin Long) as a Grey hat hacker. In real life Kevin Mitnick, Kevin Poulsen, Robert Tappan Morris, George Hotz, Gary McKinnon are some examples of hackers, most of whom were black hats, most of them reformed into white hats and security specialists. There are many other groups and names but that perhaps is best for another day altogether. Now why am I sharing this. Because in all of the above, the people who are using and working with the systems have better than average understanding of systems and they arguably would be better than most people at securing their networks, systems etc. but as we shall see in this case there has been lots of issues in the company.

WhiteHat Jr. and 300 Million Dollars Before I start this, I would like to share that for me this suit in many ways seems to be similar to the suit filed against Krishnaraj Rao . Although the difference is that Krishnaraj Rao s case/suit is that it was in real estate while this one is in education although many things are similar to those cases but also differ in some obvious ways. For e.g. in the suit against Krishnaraj Rao, the plaintiff s first approached the High Court and then the Supreme Court. Of course Krishnaraj Rao won in the High Court and then in the SC plaintiff s agreed to Krishnaraj Rao s demands as they knew they could not win in SC. In that case, a compromise was reached by the plaintiff just before judgement was to be delivered. In this case, the plaintiff have directly approached the Delhi High Court. The charges against Mr. Poonia (the defendant in this case) are very much similar to those which were made in Krishnaraj Rao s suit hence won t be going into those details. They have claimed defamation and filed a 20 crore suit. The idea is basically to silence any whistle-blowers.

Fictional Character Wolf Gupta The first issue in this case or perhaps one of the most famous or infamous character is an unknown. While he has been reportedly hired by Google India, BJYU, Chandigarh. This has been reported by Yahoo News. I did a cursory search on LinkedIn to see if there indeed is a wolf gupta but wasn t able to find any person with such a name. I am not even talking the amount of money/salary the fictitious gentleman is supposed to have got and the various variations on the salary figures at different times and the different ads.

If I wanted to, I could have asked few of the kind souls whom I know are working in Google to see if they can find such a person using their own credentials but it probably would have been a waste of time. When you show a LinkedIn profile in your social media, it should come up in the results, in this case it doesn t. I also tried to find out if somehow BJYU was a partner to Google and came up empty there as well. There is another story done by Kan India but as I m not a subscriber, I don t know what they have written but the beginning of the story itself does not bode well. While I can understand marketing, there is a line between marketing something and being misleading. At least to me, all of the references shared seems misleading at least to me.

Taking down dissent One of the big no-nos at least from what I perceive, you cannot and should not take down dissent or critique. Indians, like most people elsewhere around the world, critique and criticize day and night. Social media like twitter, mastodon and many others would not exist in the place if criticisms are not there. In fact, one could argue that Twitter and most social media is used to drive engagements to a person, brand etc. It is even an official policy in Twitter. Now you can t drive engagements without also being open to critique and this is true of all the web, including of WordPress and me  . What has been happening is that whitehatjr with help of bjyu have been taking out content of people citing copyright violation which seems laughable. When citizens critique anything, we are obviously going to take the name of the product otherwise people would have to start using new names similar to how Tom Riddle was known as Dark Lord , Voldemort and He who shall not be named . There have been quite a few takedowns, I just provide one for reference, the rest of the takedowns would probably come in the ongoing suit/case.
Whitehat Jr. ad showing investors fighting

Now a brief synopsis of what the ad. is about. The ad is about a kid named Chintu who makes an app. The app. Is so good that investors come to his house and right in the lawn and start fighting each other. The parents are enjoying looking at the fight and to add to the whole thing there is also a nosy neighbor who has his own observations. Simply speaking, it is a juvenile ad but it works as most parents in India, as elsewhere are insecure.
Jihan critiquing the whitehatjr ad
Before starting, let me assure that I asked Jihan s parents if it s ok to share his ad on my blog and they agreed. What he has done is broken down the ad and showed how juvenile the ad is and using logic and humor as a template for the same. He does make sure to state that he does not know how the product is as he hasn t used it. His critique was about the ad and not the product as he hasn t used that.

The Website If you look at the website, sadly, most of the site only talks about itself rather than giving examples that people can look in detail. For e.g. they say they have few apps. on Google play-store but no link to confirm the same. The same is true of quite a few other things. In another ad a Paralympic star says don t get into sports and get into coding. Which athlete in their right mind would say that? And it isn t that we (India) are brimming with athletes at the international level. In the last outing which was had in 2016, India sent a stunning 117 athletes but that was an exception as we had the women s hockey squad which was of 16 women, and even then they were overshadowed in numbers by the bureaucratic and support staff. There was criticism about the staff bit but that is probably a story for another date. Most of the site doesn t really give much value and the point seems to be driving sales to their courses. This is pressurizing small kids as well as teenagers and better who are in the second and third year science-engineering whose parents don t get that it is advertising and it is fake and think that their kids are incompetent. So this pressurizes both small kids as well as those who are learning, doing in whatever college or educational institution . The teenagers more often than not are unable to tell/share with them that this is advertising and fake. Also most of us have been on a a good diet of ads. Fair and lovely still sells even though we know it doesn t work. This does remind me of a similar fake academy which used very much similar symptoms and now nobody remembers them today. There used to be an academy called Wings Academy or some similar name. They used to advertise that you come to us and we will make you into a pilot or an airhostess and it was only much later that it was found out that most kids were doing laundry work in hotels and other such work. Many had taken loans, went bankrupt and even committed suicide because they were unable to pay off the loans due to the dreams given by the company and the harsh realities that awaited them. They were sued in court but dunno what happened but soon they were off the radar so we never came to know what happened to those million of kids whose life dreams were shattered.

Security Now comes the security part. They have alleged that Mr. Poonia broke into their systems. While this may be true, what I find funny is that with the name Whitehat, how can they justify it? If you are saying you are white hat you are supposed to be much better than this. And while I have not tried to penetrate their systems, I did find it laughable that the site is using an expired https:// certificate. I could have tried further to figure out the systems but I chose not to. How they could not have an automated script to get the certificate fixed is beyond me, this is known as certificate outage and is very well understood in the industry. There are tools like Let s Encrypt and Certbot (both EFF) and many others. But that is their concern, not mine.

Comparison A similar offering would be unacademy but as can be seen they neither try to push you in any way and nor do they make any ridiculous claims. In fact how genuine unacademy is can be gauged from the fact that many of its learning resources are available to people to see on YT and if they have tools they can also download it. Now, does this mean that every educational website should have their content for free, of course not. But when a channel has 80% 90% of it YT content as ads and testimonials then they surely should give a reason to pause both for parents and students alike. But if parents had done that much research, then things would not be where they are now.

Allegations Just to complete, there are allegations by Mr. Poonia with some screenshots which show the company has been doing a lot of bad things. For e.g. they were harassing an employee at night 2 a.m. who was frustrated and working in the company at the time. Many of the company staff routinely made sexist and offensive, sexual abusive remarks privately between themselves for prospective women who came to interview via webcam (due to the pandemic). There also seems to be a bit of porn on the web/mobile server of the company as well. There also have been allegations that while the company says refund is done next day, many parents who have demanded those refunds have not got it. Now while Mr. Poonia has shared some quotations of the staff while hiding the identities of both the victims and the perpetrators, the language being used in itself tells a lot. I am in two minds whether to share those photos or not hence atm choosing not to. Poonia has also contended that all teachers do not know programming, and they are given scripts to share. There have been some people who did share that experience with him
Suruchi Sethi
From the company s side they are alleging he has hacked the company servers and would probably be using the Fruit of the poisonous tree argument which we have seen have been used in many arguments.

Conclusion Now that lies in the eyes of the Court whether the single bench chooses the literal meaning or use the spirit of the law or the genuine concerns of the people concerned. While in today s hearing while the company asked for a complete sweeping injunction they were unable to get it. Whatever may happen, we may hope to see some fireworks in the second hearing which is slated to be on 6.01.2021 where all of this plays out. Till later.

18 July 2020

Ritesh Raj Sarraf: Laptop Mode Tools 1.74

Laptop Mode Tools 1.74 Laptop Mode Tools version 1.74 has been released. This release includes important bug fixes, some defaults settings updated to current driver support in Linux and support for devices with nouveau based nVIDIA cards. A filtered list of changes is mentioned below. For the full log, please refer to the git repository

1.74 - Sat Jul 18 19:10:40 IST 2020
* With 4.15+ kernels, Linux Intel SATA has a better link power
  saving policy, med_power_with_dipm, which should be the recommended
  one to use
* Disable defaults for syslog logging
* Initialize LM_VERBOSE with default to disabled
* Merge pull request #157 from rickysarraf/nouveau
* Add power saving module for nouveau cards
* Disable ethernet module by default
* Add board-specific folder and documentation
* Add execute bit on module radeon-dpm
* Drop unlock because there is no lock acquired

Resources

What is Laptop Mode Tools
Description: Tools for Power Savings based on battery/AC status
 Laptop mode is a Linux kernel feature that allows your laptop to save
 considerable power, by allowing the hard drive to spin down for longer
 periods of time. This package contains the userland scripts that are
 needed to enable laptop mode.
 .
 It includes support for automatically enabling laptop mode when the
 computer is working on batteries. It also supports various other power
 management features, such as starting and stopping daemons depending on
 power mode, automatically hibernating if battery levels are too low, and
 adjusting terminal blanking and X11 screen blanking
 .
 laptop-mode-tools uses the Linux kernel's Laptop Mode feature and thus
 is also used on Desktops and Servers to conserve power

Next.

Previous.